Short text similarity measurement methods: a review
Abstract
Short text similarity measurement methods play an important role in many applications within natural language processing. This paper reviews the research literature on short text similarity (STS) measurement methods with the aim to (i) classify and give a broad overview of existing techniques; (ii) find out their strengths and weaknesses in terms of domain independence, the requirement of semantic knowledge, corpus and training data, and the ability to identify semantic meaning, word order similarity and polysemy; and (iii) identify the semantic knowledge and corpus resources that can be utilized for STS measurement methods. Furthermore, our study also considers various issues, such as the difference between semantic knowledge sources and corpora for text similarity. Although there are a few review papers in this area, they focus mostly on only one or two techniques. Moreover, existing review papers do not cover recent research. To the best of our knowledge, this is a comprehensive systematic literature review on this topic. The findings of this research are as follows: It identified four semantic knowledge and eight corpus resources as external resources, which can be classified into general-purpose and domain-specific. The existing techniques can be classified into string-based, corpus-based, knowledge-based and hybrid-based. Moreover, experts and researchers can utilize this review as a benchmark as well as a reference to the limitations of current techniques. The paper then identifies open issues that can be considered feasible opportunities for future research directions.
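As a rough illustration of the string-based class of techniques named in the abstract, the sketch below computes Jaccard overlap between the word sets of two short texts. The function names `tokenize` and `jaccard_similarity` and the example sentences are assumptions made for illustration; they are not taken from the reviewed paper.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two token sets: |A & B| / |A | B|."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        # Two empty texts are treated as identical (a convention chosen here).
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


if __name__ == "__main__":
    # Two of three distinct tokens per sentence are shared: 2 / 4.
    print(jaccard_similarity("a cat sat", "a cat ran"))  # 0.5
    # Synonyms share no surface form, so only "the" and "is" match: 2 / 6.
    print(round(jaccard_similarity("The car is fast",
                                   "The automobile is quick"), 3))  # 0.333
```

The second pair scores low even though the sentences mean nearly the same thing. A purely string-based measure cannot see that "car" and "automobile" are synonyms, which is the kind of limitation in identifying meaning that the abstract lists among its comparison criteria.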
Similar resources
Corpus-Based methods for Short Text Similarity
This paper presents corpus-based methods to find similarity between short text (sentences, paragraphs, ...) which has many applications in the field of NLP. Previous works on this problem have been based on supervised methods or have used external resources such as WordNet, British National Corpus etc. Our methods are focused on unsupervised corpus-based methods. We present a new method, based ...
Benchmarking short text semantic similarity
Short Text Semantic Similarity measurement is a new and rapidly growing field of research. “Short texts” are typically sentence length but are not required to be grammatically correct. There is great potential for applying these measures in fields such as Information Retrieval, Dialogue Management and Question Answering. A dataset of 65 sentence pairs, with similarity ratings, produced in 2006 ...
A review on similarity measurement methods in trust-based recommender systems
These days, due to the growth of e-commerce sites, access to information about items is easier than in the past. But because of the huge amount of information, we need new filtering techniques to find information of interest faster and more accurately. Therefore, recommender systems (RS) were introduced to solve this problem. Although several recommender approaches have been proposed, collaborative filtering (CF) appr...
Similarity Measures for Short Segments of Text
Measuring the similarity between documents and queries has been extensively studied in information retrieval. However, there are a growing number of tasks that require computing the similarity between two very short segments of text. These tasks include query reformulation, sponsored search, and image retrieval. Standard text similarity measures perform poorly on such tasks because of data spar...
Text-to-Text Semantic Similarity for Automatic Short Answer Grading
In this paper, we explore unsupervised techniques for the task of automatic short answer grading. We compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating automatic feedback from the student answers. Overall...
Journal
Journal title: Soft Computing
Year: 2021
ISSN: 1433-7479, 1432-7643
DOI: https://doi.org/10.1007/s00500-020-05479-2